Chapter 6 Measures Of Dispersion
Introduction to Measures of Dispersion
The Limitations of Averages
In the previous chapter, you learned how to summarize a dataset into a single representative value using measures of central tendency (like mean, median, and mode). However, this single value does not reveal the variability or dispersion present in the data. An average tells us about the central location of a distribution but says nothing about how the values are spread out around that central point.
Example 1. Three friends, Ram, Rahim, and Maria, are discussing their family incomes. Each of them calculates that the average income per member in their family is ₹15,000.
| Sl. No. | Ram's Family (₹) | Rahim's Family (₹) | Maria's Family (₹) |
|---|---|---|---|
| 1. | 12,000 | 7,000 | 0 |
| 2. | 14,000 | 10,000 | 7,000 |
| 3. | 16,000 | 14,000 | 8,000 |
| 4. | 18,000 | 17,000 | 10,000 |
| 5. | - | 20,000 | 50,000 |
| 6. | - | 22,000 | - |
| Total Income | 60,000 | 90,000 | 75,000 |
| Average Income | 15,000 | 15,000 | 15,000 |
Although the average income is the same for all three families, the distribution of individual incomes is very different. In Ram's family, the incomes are closely clustered around the average. In Rahim's family, they are more spread out, and in Maria's family, the differences are the highest, with one member earning a very high salary and another earning nothing.
This example clearly shows that knowledge of only the average is insufficient. We need another value that reflects the quantum of variation to better understand the distribution.
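The point of Example 1 can be checked with a few lines of Python. This is only an illustrative sketch: the incomes come from the table above, while the helper name `summarize` is made up for this example.

```python
from statistics import mean

# Incomes (in rupees) taken from the table in Example 1.
families = {
    "Ram": [12000, 14000, 16000, 18000],
    "Rahim": [7000, 10000, 14000, 17000, 20000, 22000],
    "Maria": [0, 7000, 8000, 10000, 50000],
}

def summarize(incomes):
    """Return (average income, gap between highest and lowest income)."""
    return mean(incomes), max(incomes) - min(incomes)

for name, incomes in families.items():
    average, gap = summarize(incomes)
    print(f"{name}: average = {average}, highest - lowest = {gap}")
```

All three families show the same average of 15,000, but the gap between the highest and lowest income grows from 6,000 (Ram) to 15,000 (Rahim) to 50,000 (Maria).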
What is Dispersion?
Dispersion is the extent to which values in a distribution differ from the average of that distribution. It measures the spread or variability of the data. A measure of dispersion can reveal information about inequalities. For example, while per capita income gives an average, a measure of dispersion can tell us about income inequalities between different strata of society.
Common Measures of Dispersion
To quantify the extent of variation, there are several measures:
- Range
- Quartile Deviation
- Mean Deviation
- Standard Deviation
Range and Quartile Deviation are measures based on the spread of values, while Mean Deviation and Standard Deviation are based on the deviations of values from an average.
Measures Based on the Spread of Values
These measures calculate dispersion by looking at the spread within which the values of a distribution lie.
1. Range
The Range (R) is the simplest measure of dispersion. It is the difference between the largest (L) and the smallest (S) value in a distribution.
$R = L - S$
A higher value of the range implies higher dispersion.
Limitations of Range
- It is unduly affected by extreme values (outliers).
- It is not based on all the values in the distribution; it only uses the two extreme observations.
- It cannot be calculated for open-ended frequency distributions (where the lowest or highest class limit is not specified).
Despite its limitations, the range is frequently used because of its simplicity. For example, we often see the maximum and minimum temperatures of cities, which is a form of range.
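The first limitation can be seen in a short sketch. It reuses the incomes of Maria's family from Example 1; the function name `value_range` is only an illustrative choice.

```python
def value_range(values):
    """Range R = L - S: largest value minus smallest value."""
    return max(values) - min(values)

# Incomes of Maria's family from Example 1 (in rupees).
maria = [0, 7000, 8000, 10000, 50000]

print(value_range(maria))       # all five members
print(value_range(maria[:-1]))  # same family without the 50,000 earner
```

Dropping the single extreme income reduces the range from 50,000 to 10,000, even though the other four values are unchanged.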
2. Quartile Deviation
To overcome the problem of extreme values, the Quartile Deviation (Q.D.) is used. It is based on the middle 50% of the data and is therefore not affected by outliers. It is calculated using the upper quartile ($Q_3$) and the lower quartile ($Q_1$).
First, the Interquartile Range is calculated, which is the difference between the third and first quartiles:
Interquartile Range = $Q_3 - Q_1$
Half of the interquartile range is called the Quartile Deviation (Q.D.), also known as the Semi-Interquartile Range.
Quartile Deviation (Q.D.) = $\frac{Q_3 - Q_1}{2}$
Calculation of Quartile Deviation
The process involves finding the values of $Q_1$ and $Q_3$.
- For ungrouped data:
$Q_1 =$ size of $(\frac{n+1}{4})^{th}$ item
$Q_3 =$ size of $3(\frac{n+1}{4})^{th}$ item
- For continuous series:
First, find the quartile class, i.e. the class in which the $(\frac{n}{4})^{th}$ item (for $Q_1$) or the $(\frac{3n}{4})^{th}$ item (for $Q_3$) lies, where $n$ is the total frequency. Then, use the formula:
$Q_1 = L + \frac{\frac{n}{4} - c.f.}{f} \times i$
$Q_3 = L + \frac{\frac{3n}{4} - c.f.}{f} \times i$
Where L is the lower limit of the quartile class, c.f. is the cumulative frequency of the class preceding the quartile class, f is the frequency of the quartile class, and i is the width of the class interval.
Quartile deviation can be calculated for open-ended distributions and is a good measure of dispersion when extreme values are present.
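The two procedures above can be sketched in Python as follows. The data sets are illustrative and not taken from the text, the function names are hypothetical, and two assumptions are made: a fractional item position is handled by linear interpolation between neighbouring items, and $Q_3$ for a continuous series uses the same formula as $Q_1$ with $\frac{3n}{4}$ in place of $\frac{n}{4}$.

```python
def item_at(sorted_values, position):
    """Size of the item at a 1-based position in sorted data.
    Assumption: fractional positions are linearly interpolated."""
    lower = int(position)
    fraction = position - lower
    value = sorted_values[lower - 1]
    if fraction > 0:
        value += fraction * (sorted_values[lower] - sorted_values[lower - 1])
    return value

def ungrouped_qd(values):
    """Return (Q1, Q3, Q.D.) for ungrouped data with at least 3 values."""
    data = sorted(values)
    n = len(data)
    if n < 3:
        raise ValueError("need at least 3 values")
    q1 = item_at(data, (n + 1) / 4)
    q3 = item_at(data, 3 * (n + 1) / 4)
    return q1, q3, (q3 - q1) / 2

def grouped_quartile(classes, k):
    """k-th quartile (k = 1 or 3) of a continuous series.
    classes: list of (lower_limit, upper_limit, frequency) in ascending order."""
    n = sum(f for _, _, f in classes)
    target = k * n / 4
    cum_freq = 0  # c.f. of the classes before the current one
    for lower, upper, f in classes:
        if f > 0 and cum_freq + f >= target:
            return lower + (target - cum_freq) / f * (upper - lower)
        cum_freq += f
    raise ValueError("no quartile class found")

# Illustrative ungrouped data (n = 11, so Q1 is the 3rd and Q3 the 9th item).
marks = [20, 25, 29, 30, 35, 39, 41, 48, 51, 60, 70]
print(ungrouped_qd(marks))

# Illustrative continuous series with total frequency 40.
series = [(0, 10, 4), (10, 20, 6), (20, 30, 10), (30, 40, 12), (40, 50, 8)]
q1 = grouped_quartile(series, 1)
q3 = grouped_quartile(series, 3)
print(q1, q3, (q3 - q1) / 2)
```

For the ungrouped data, $Q_1 = 29$ and $Q_3 = 51$, so Q.D. $= 11$. For the continuous series, $Q_1 = 20$ and $Q_3 \approx 38.33$, giving Q.D. $\approx 9.17$.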
Measures of Dispersion from Average
Range and Quartile Deviation give an idea about the spread of values but do not measure how far the values are from their average. Two important measures based on the deviation of values from their average are Mean Deviation and Standard Deviation.
Since the average is a central value, some deviations from it are positive and some are negative. The sum of deviations from the Arithmetic Mean is always zero. To overcome this, Mean Deviation ignores the signs of the deviations, while Standard Deviation squares them.
1. Mean Deviation (M.D.)
The Mean Deviation is the arithmetic mean of the absolute differences (deviations) of the values from their average. The average used can be either the arithmetic mean or the median.
Calculation of Mean Deviation
(a) From Arithmetic Mean for ungrouped data:
M.D. $(\bar{X}) = \frac{\sum|d|}{n} = \frac{\sum|X - \bar{X}|}{n}$
(b) From Median for ungrouped data:
M.D. (Median) = $\frac{\sum|d|}{n} = \frac{\sum|X - \text{Median}|}{n}$
(c) For continuous distribution (from mean):
M.D. $(\bar{X}) = \frac{\sum f|d|}{\sum f} = \frac{\sum f|m - \bar{X}|}{\sum f}$
Where m is the midpoint of each class and f is its frequency.
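A minimal sketch of formulas (a), (b) and (c), using small illustrative data sets that are not from the text. The function names are hypothetical.

```python
from statistics import mean, median

def mean_deviation(values, centre):
    """Average of the absolute deviations of the values from `centre`."""
    return sum(abs(x - centre) for x in values) / len(values)

def grouped_mean_deviation(midpoints, freqs):
    """Mean deviation about the mean for a continuous distribution,
    using class midpoints m and frequencies f."""
    total = sum(freqs)
    x_bar = sum(f * m for m, f in zip(midpoints, freqs)) / total
    return sum(f * abs(m - x_bar) for m, f in zip(midpoints, freqs)) / total

values = [2, 4, 7, 8, 9]                       # illustrative ungrouped data
print(mean_deviation(values, mean(values)))    # about the mean (6)
print(mean_deviation(values, median(values)))  # about the median (7)

# Illustrative classes 0-10, 10-20, 20-30 with frequencies 1, 2, 1.
print(grouped_mean_deviation([5, 15, 25], [1, 2, 1]))
```

For the ungrouped values, the absolute deviations from the mean sum to 12 and those from the median sum to 11, so the mean deviation is 2.4 about the mean and 2.2 about the median.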
Limitations of Mean Deviation
Mean deviation is based on all values. However, its main limitation is that it ignores the algebraic signs of the deviations. This makes it difficult to handle algebraically and less suitable for further statistical analysis.
2. Standard Deviation (S.D.)
The Standard Deviation is the most widely used measure of dispersion. It is defined as the positive square root of the mean of the squared deviations from the arithmetic mean. It is denoted by the Greek letter sigma ($\sigma$).
Variance and Standard Deviation
The mean of the squared deviations from the mean is called the Variance ($\sigma^2$). The Standard Deviation is the positive square root of the variance.
Variance ($\sigma^2$) = $\frac{\sum(X - \bar{X})^2}{n}$
Standard Deviation ($\sigma$) = $\sqrt{\frac{\sum(X - \bar{X})^2}{n}}$
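The two formulas can be applied directly, as in the sketch below. The five values are illustrative and the function names are hypothetical; the result is cross-checked against `statistics.pstdev`, which also divides by $n$.

```python
from math import sqrt
from statistics import pstdev

def variance(values):
    """Mean of the squared deviations from the arithmetic mean."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / n

def standard_deviation(values):
    """Positive square root of the variance."""
    return sqrt(variance(values))

values = [5, 10, 25, 30, 50]  # illustrative data, mean = 24
x_bar = sum(values) / len(values)

print(sum(x - x_bar for x in values))  # plain deviations cancel out
print(variance(values))                # squared deviations do not
print(standard_deviation(values), pstdev(values))
```

The plain deviations sum to zero, while the squared deviations sum to 1270, giving a variance of 254 and a standard deviation of about 15.94.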
Calculation of Standard Deviation (for continuous frequency distribution)
Several methods can be used, with the step-deviation method being one of the most common for simplifying calculations.
$\sigma = \sqrt{\frac{\sum fd'^2}{\sum f} - \left(\frac{\sum fd'}{\sum f}\right)^2} \times c$
Where $d' = \frac{m - A}{c}$ (m = midpoint of the class, f = frequency of the class, A = assumed mean, c = common factor).
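The step-deviation formula can be checked against the direct definition of standard deviation. The sketch below assumes an illustrative distribution (classes 0–10 to 40–50), an assumed mean $A = 25$ and common factor $c = 10$; none of these values come from the text, and the function names are hypothetical.

```python
from math import sqrt

def sd_step_deviation(midpoints, freqs, assumed_mean, common_factor):
    """Standard deviation by the step-deviation method."""
    total = sum(freqs)
    steps = [(m - assumed_mean) / common_factor for m in midpoints]  # d'
    sum_fd = sum(f * d for f, d in zip(freqs, steps))
    sum_fd2 = sum(f * d * d for f, d in zip(freqs, steps))
    return sqrt(sum_fd2 / total - (sum_fd / total) ** 2) * common_factor

def sd_direct(midpoints, freqs):
    """Standard deviation from the definition, for comparison."""
    total = sum(freqs)
    x_bar = sum(f * m for m, f in zip(midpoints, freqs)) / total
    return sqrt(sum(f * (m - x_bar) ** 2 for m, f in zip(midpoints, freqs)) / total)

midpoints = [5, 15, 25, 35, 45]   # midpoints of classes 0-10, ..., 40-50
freqs = [4, 6, 10, 12, 8]         # illustrative frequencies, total 40

print(sd_step_deviation(midpoints, freqs, assumed_mean=25, common_factor=10))
print(sd_direct(midpoints, freqs))
```

Here $\sum fd' = 14$ and $\sum fd'^2 = 66$, so $\sigma = \sqrt{1.65 - 0.1225} \times 10 \approx 12.36$, which agrees with the direct calculation. Choosing a different assumed mean changes the intermediate sums but not the final answer.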
Properties of Standard Deviation
- It is based on all values.
- It is always calculated from the mean.
- It is independent of origin (a change in the origin does not affect S.D.) but not of scale (if values are divided/multiplied by a constant, the S.D. is also divided/multiplied by that constant).
- It is suitable for further algebraic treatment and is used in many advanced statistical problems.
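The origin and scale property can be verified numerically. This is a small sketch on illustrative data, using `statistics.pstdev` (the standard deviation that divides by $n$); the shift of 20 and the factor of 1.1 are arbitrary choices.

```python
from statistics import pstdev

values = [5, 10, 25, 30, 50]            # illustrative data

shifted = [x + 20 for x in values]      # change of origin
scaled = [x * 1.1 for x in values]      # change of scale

print(pstdev(values))
print(pstdev(shifted))   # same as the original S.D.
print(pstdev(scaled))    # 1.1 times the original S.D.
```

Adding a constant to every value moves the mean by the same constant, so the deviations from the mean, and hence the S.D., are unchanged. Multiplying every value by a constant multiplies each deviation, and therefore the S.D., by that constant.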
Absolute and Relative Measures of Dispersion
Absolute vs. Relative Measures
All the measures discussed so far (Range, Q.D., M.D., S.D.) are absolute measures of dispersion. They express the variation in the same units as the original data (e.g., rupees, kilograms).
Weaknesses of Absolute Measures
- Difficult to Compare: They can be misleading when comparing the variability of two different distributions, especially when their averages differ significantly or their units of measurement are different. For instance, a range of ₹500 in the sales of a small vendor is not comparable to a range of ₹30,000 for a large departmental store.
- Unit Dependent: The value of the measure changes if the unit of measurement changes (e.g., from kilometers to meters).
Relative Measures of Dispersion
To overcome these problems, relative measures of dispersion are used. These are expressed as ratios or percentages and are free from the units of measurement, making them suitable for comparison. Each absolute measure has a corresponding relative measure, often called a "coefficient".
| Absolute Measure | Relative Measure (Coefficient) | Formula |
|---|---|---|
| Range | Coefficient of Range | $\frac{L - S}{L + S}$ |
| Quartile Deviation | Coefficient of Quartile Deviation | $\frac{Q_3 - Q_1}{Q_3 + Q_1}$ |
| Mean Deviation | Coefficient of Mean Deviation | $\frac{\text{M.D.}(\bar{X})}{\bar{X}}$ or $\frac{\text{M.D.}(\text{Median})}{\text{Median}}$ |
| Standard Deviation | Coefficient of Variation (C.V.) | $\frac{\sigma}{\bar{X}} \times 100$ |
The Coefficient of Variation (C.V.) is the most commonly used relative measure of dispersion. It is expressed as a percentage and is used to compare the variability, consistency, or uniformity of two or more series. A series with a lower C.V. is considered more consistent or stable.
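As an illustration, the C.V. can be computed for the three families of Example 1, whose averages are all ₹15,000. The function name is hypothetical, and the standard deviation is the one defined earlier (dividing by $n$), here obtained from `statistics.pstdev`.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """C.V. = (standard deviation / mean) * 100, as a percentage."""
    return pstdev(values) / mean(values) * 100

# Incomes (in rupees) from the table in Example 1.
families = {
    "Ram": [12000, 14000, 16000, 18000],
    "Rahim": [7000, 10000, 14000, 17000, 20000, 22000],
    "Maria": [0, 7000, 8000, 10000, 50000],
}

for name, incomes in families.items():
    print(f"{name}: C.V. = {coefficient_of_variation(incomes):.1f}%")
```

The C.V. comes to roughly 14.9% for Ram's family, 35.3% for Rahim's and 118.8% for Maria's, so incomes are most uniform in Ram's family and least uniform in Maria's.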
Lorenz Curve: A Graphic Measure of Dispersion
The Lorenz Curve is a graphical method used to estimate and visualize inequalities in a distribution, particularly for income and wealth. While other measures provide a numerical value of dispersion, the Lorenz curve provides a visual representation.
Construction of the Lorenz Curve
The following steps are required to construct a Lorenz Curve:
- Data on the variable (e.g., income) and the number of individuals (e.g., employees) are arranged in classes.
- The cumulative frequencies of the number of individuals and the cumulative values of the variable are calculated.
- These cumulative values are then converted into percentages of their respective totals.
- The cumulative percentages of the individuals are plotted on the horizontal axis (X-axis), and the cumulative percentages of the variable are plotted on the vertical axis (Y-axis).
- The plotted points are joined by a smooth curve. This curve is the Lorenz Curve.
- A straight diagonal line is drawn from the origin (0, 0) to the point (100, 100). This is called the Line of Equal Distribution.
Studying the Lorenz Curve
The Line of Equal Distribution represents a situation of perfect equality (e.g., the bottom 20% of people earn 20% of the total income, the bottom 50% earn 50%, and so on). The farther the Lorenz Curve is from this line, the greater the inequality present in the distribution.
The Lorenz Curve is especially useful for comparing the degree of inequality in two or more distributions by drawing their curves on the same graph. The curve that is farthest from the line of equal distribution represents the distribution with the highest inequality.
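The construction steps can be sketched as a small function that returns the points to be plotted. The data are illustrative, the function name is hypothetical, and two assumptions are made: the classes are listed from the lowest to the highest income group, and the total income of each class is already known.

```python
def lorenz_points(persons, incomes):
    """Return (cumulative % of persons, cumulative % of income) pairs,
    starting from the origin (0, 0)."""
    total_persons = sum(persons)
    total_income = sum(incomes)
    points = [(0.0, 0.0)]
    cum_persons = 0
    cum_income = 0
    for p, inc in zip(persons, incomes):
        cum_persons += p
        cum_income += inc
        points.append((100 * cum_persons / total_persons,
                       100 * cum_income / total_income))
    return points

# Illustrative data: persons per class and total income earned by each class.
persons = [20, 30, 30, 20]
incomes = [10, 20, 30, 40]

for x, y in lorenz_points(persons, incomes):
    # On the Line of Equal Distribution, y would equal x.
    print(f"bottom {x:.0f}% of persons earn {y:.0f}% of income")
```

Every intermediate point lies below the Line of Equal Distribution (for example, the bottom 50% of persons earn only 30% of the income), and the vertical gap between a point and the line indicates how unequal the distribution is at that level.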
NCERT Questions and Solutions
Question 1. A measure of dispersion is a good supplement to the central value in understanding a frequency distribution. Comment.
Answer:
Question 2. Which measure of dispersion is the best and how?
Answer:
Question 3. Some measures of dispersion depend upon the spread of values whereas some are estimated on the basis of the variation of values from a central value. Do you agree?
Answer:
Question 4. In a town, 25% of the persons earned more than Rs 45,000 whereas 75% earned more than Rs 18,000. Calculate the absolute and relative values of dispersion.
Answer:
Question 5. The yield of wheat and rice per acre for 10 districts of a state is as under:
| District | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Wheat | 12 | 10 | 15 | 19 | 21 | 16 | 18 | 9 | 25 | 10 |
| Rice | 22 | 29 | 12 | 23 | 18 | 15 | 12 | 34 | 18 | 12 |
Calculate for each crop,
(i) Range
(ii) Q.D.
(iii) Mean deviation about Mean
(iv) Mean deviation about Median
(v) Standard deviation
(vi) Which crop has greater variation?
(vii) Compare the values of different measures for each crop.
Answer:
Question 6. In the previous question, calculate the relative measures of variation and indicate the value which, in your opinion, is more reliable.
Answer:
Question 7. A batsman is to be selected for a cricket team. The choice is between X and Y on the basis of their scores in five previous tests which are:
| Batsman | Test 1 | Test 2 | Test 3 | Test 4 | Test 5 |
|---|---|---|---|---|---|
| X | 25 | 85 | 40 | 80 | 120 |
| Y | 50 | 70 | 65 | 45 | 80 |
Which batsman should be selected if we want,
(i) a higher run getter, or
(ii) a more reliable batsman in the team?
Answer:
Question 8. To check the quality of two brands of lightbulbs, their life in burning hours was estimated as under for 100 bulbs of each brand.
| Life (in hrs) | No. of bulbs (Brand A) | No. of bulbs (Brand B) |
|---|---|---|
| 0–50 | 15 | 2 |
| 50–100 | 20 | 8 |
| 100–150 | 18 | 60 |
| 150–200 | 25 | 25 |
| 200–250 | 22 | 5 |
| Total | 100 | 100 |
(i) Which brand gives higher life?
(ii) Which brand is more dependable?
Answer:
Question 9. The average daily wage of 50 workers of a factory was Rs 200 with a standard deviation of Rs 40. Each worker is given a raise of Rs 20. What is the new average daily wage and standard deviation? Have the wages become more or less uniform?
Answer:
Question 10. If, in the previous question, each worker is given a hike of 10% in wages, how are the mean and standard deviation values affected?
Answer:
Question 11. Calculate the mean deviation about the mean and the standard deviation for the following distribution.
| Classes | Frequencies |
|---|---|
| 20–40 | 3 |
| 40–80 | 6 |
| 80–100 | 20 |
| 100–120 | 12 |
| 120–140 | 9 |
| Total | 50 |
Answer:
Question 12. The sum of 10 values is 100 and the sum of their squares is 1090. Find out the coefficient of variation.
Answer: